Tsunami: massively parallel homomorphic hashing on many-core GPUs

Authors

  • Xiaowen Chu
  • Kaiyong Zhao
  • Zongpeng Li
Abstract

Homomorphic hash functions (HHFs) play a key role in securing distributed systems that use coding techniques such as erasure coding and network coding. The computational complexity of HHFs remains a major challenge. In this paper, we present a massively parallel solution, named Tsunami, that exploits the widely available many-core Graphics Processing Units (GPUs). Tsunami includes the following optimization techniques to achieve the highest ever hashing throughput: (1) using Montgomery multiplication and pre-computation to speed up modular exponentiations; (2) using a clean implementation of Montgomery multiplication to reduce the demand for registers and shared memory and to increase the utilization of GPU processing cores; (3) using our own assembly code to implement 32-bit integer multiplication, which outperforms the assembly code generated by the native compiler by 20%; (4) exploiting memory alignment and constant memory on GPUs to improve the efficiency of memory access. Integrating these techniques, Tsunami achieves a significant improvement over existing results. Specifically, the hashing throughput achieved by Tsunami on a GTX295 GPU is about 33 times that of the existing solution on a quad-core CPU. We also show that the hashing throughput grows almost linearly with the number of GPU cores.
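
To make technique (1) concrete, here is a minimal CPU-side sketch in Python of Montgomery multiplication driving a modular exponentiation, followed by a toy homomorphic hash built on it. This is not the Tsunami GPU code. The function names (`montgomery_setup`, `redc`, `mont_mul`, `mont_pow`, `hhf`), the toy modulus, and the generators are all assumptions chosen for illustration, and the hash form (a product of powers `g_i ** b_i mod p`) is the commonly used discrete-log-based construction, which the abstract does not spell out.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n with R = 2**k > n."""
    r = 1 << k
    assert n % 2 == 1 and n < r
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R); needs Python 3.8+
    r2 = (r * r) % n                 # used to convert into Montgomery form
    return r, n_prime, r2


def redc(t, n, k, n_prime):
    """Montgomery reduction: return t * R^(-1) mod n for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k             # exact division by R, no trial division
    return u - n if u >= n else u


def mont_mul(a, b, n, k, n_prime):
    """Multiply two values that are already in Montgomery form."""
    return redc(a * b, n, k, n_prime)


def mont_pow(base, exp, n, k):
    """Compute base**exp mod n by square-and-multiply in Montgomery form."""
    r, n_prime, r2 = montgomery_setup(n, k)
    x = mont_mul(base % n, r2, n, k, n_prime)   # base * R mod n
    acc = r % n                                 # 1 in Montgomery form
    while exp:
        if exp & 1:
            acc = mont_mul(acc, x, n, k, n_prime)
        x = mont_mul(x, x, n, k, n_prime)
        exp >>= 1
    return redc(acc, n, k, n_prime)             # leave Montgomery form


def hhf(block, gens, p):
    """Toy homomorphic hash (assumed form): product of gens[i]**block[i] mod p."""
    k = p.bit_length()
    h = 1
    for g, b in zip(gens, block):
        h = (h * mont_pow(g, b, p, k)) % p
    return h


# Hypothetical toy parameters, far too small for real security.
P = 2**61 - 1
GENS = [3, 5, 7]

a = [11, 22, 33]
b = [4, 5, 6]
combined = [x + y for x, y in zip(a, b)]

# Homomorphic property: the hash of a sum equals the product of the hashes.
print(hhf(combined, GENS, P) == (hhf(a, GENS, P) * hhf(b, GENS, P)) % P)
```

Because the generators are fixed, the constants from `montgomery_setup` and the repeated squarings of each generator can be computed once and reused across blocks, which is one plausible reading of the "pre-computation" the abstract mentions.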

Similar resources

Efficient Primitives and Algorithms for Many-core architectures

Graphics Processing Units (GPUs) are a fast evolving architecture. Over the last decade their programmability has been harnessed to solve non-graphics tasks, in many cases at a huge performance advantage to CPUs. Unlike CPUs, GPUs have always been a highly parallel architecture: thousands of lightweight execution ...

On the utility of graphics cards to perform massively parallel simulation with advanced Monte Carlo methods

We present a case-study on the utility of graphics cards to perform massively parallel simulation with advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers. For certain classes of Monte Carlo algorithms they offer massively parallel sim...

Static Memory Access Pattern Analysis on a Massively Parallel GPU

The performance of data-parallel processing can be highly sensitive to any contention in memory. In contrast to multi-core CPUs which employ a number of memory latency minimization techniques such as multi-level caching and prefetching, Graphics Processing Units (GPUs) require that the data-parallel computations reference memory in a deterministic pattern in order to reap the benefits of these ...

Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems

Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge because of the difficulty of scheduling hardware resources with respect to thread concurrency. In this paper, to resolve the problem, a novel method is proposed, which parallelizes the GA by designing three concurrent kernels, each of which running some depe...

Programming Massively Parallel Architectures using MARTE: a Case Study

Nowadays, several industrial applications are being ported to parallel architectures. These applications take advantage of the potential parallelism provided by multiple core processors. Many-core processors, especially GPUs (Graphics Processing Units), have led the race of floating-point performance since 2003. While the performance improvement of general-purpose microprocessors has slowed si...

Journal:
  • Concurrency and Computation: Practice and Experience

Volume 24  Issue

Pages  -

Publication date 2012